
Conversation

@spiridonov (Contributor) commented Oct 7, 2025

What this PR does / why we need it:

In order to add support for the rate() aggregation function, which will be implemented as count_over_time/$interval, I had to add support for math expressions. Binary expressions with only a single input are supported for now. Things like sum_over_time/count_over_time will be implemented later. There is an optimization in the physical plan that merges several math expression nodes into one.

Minor:

  • fixed some typos

Which issue(s) this PR fixes:
Fixes #

Special notes for your reviewer:

The diff for pkg/engine/internal/planner/logical/planner.go is ugly here; it is better to view the new file as a whole to understand it. I basically split one large function into 3 pieces without changing much of the logic.

Checklist

  • Reviewed the CONTRIBUTING.md guide (required)
  • Documentation added
  • Tests updated
  • Title matches the required conventional commits format, see here
    • Note that Promtail is considered to be feature complete, and future development for logs collection will be in Grafana Alloy. As such, feat PRs are unlikely to be accepted unless a case can be made for the feature actually being a bug fix to existing behavior.
  • Changes that require user attention or interaction to upgrade are documented in docs/sources/setup/upgrade/_index.md
  • If the change is deprecating or removing a configuration option, update the deprecated-config.yaml and deleted-config.yaml files respectively in the tools/deprecated-config-checker directory. Example PR

@spiridonov spiridonov requested a review from a team as a code owner October 7, 2025 18:43
@spiridonov spiridonov enabled auto-merge (squash) October 7, 2025 18:44
@spiridonov spiridonov disabled auto-merge October 7, 2025 21:07
@spiridonov spiridonov marked this pull request as draft October 8, 2025 18:44
@spiridonov spiridonov marked this pull request as ready for review October 9, 2025 16:09
@spiridonov spiridonov enabled auto-merge (squash) October 9, 2025 16:09
@chaudum (Contributor) left a comment

I would appreciate it if you could split the PR into the binop expression implementation and the rate() implementation.

@spiridonov spiridonov changed the title from "feat(engine): rate aggregation and basic math expressions" to "feat(engine): basic math expressions" Oct 14, 2025
@spiridonov (Contributor Author):

@chaudum I removed rate() changes from this PR, now it is just math expressions. PTAL.

Comment on lines 251 to 263
var vecAggType types.VectorAggregationType
switch e.Operation {
//case syntax.OpTypeCount:
// vecAggType = types.VectorAggregationTypeCount
case syntax.OpTypeSum:
vecAggType = types.VectorAggregationTypeSum
//case syntax.OpTypeMax:
// vecAggType = types.VectorAggregationTypeMax
//case syntax.OpTypeMin:
// vecAggType = types.VectorAggregationTypeMin
default:
return nil, errUnimplemented
}
Contributor:

I know that you did not change anything here, but could we follow the pattern that you introduced and move this into a separate function convertVectorAggType(op string) types.VectorAggregationType?

Same for the RangeAggregationType

Contributor Author:

fixed

end: 7200,
interval: 5 * time.Minute,
}
t.Run(`sum by (level) (count_over_time({cluster="prod", namespace=~"loki-.*"} |= "metric.go"[5m]))`, func(t *testing.T) {
Contributor:

nit: make name descriptive

Suggested change
t.Run(`sum by (level) (count_over_time({cluster="prod", namespace=~"loki-.*"} |= "metric.go"[5m]))`, func(t *testing.T) {
t.Run("simple metric query", func(t *testing.T) {

Contributor Author:

fixed

t.Logf("\n%s\n", sb.String())
})

t.Run(`sum by (level) (count_over_time({cluster="prod", namespace=~"loki-.*"}[5m]) / 300)`, func(t *testing.T) {
Contributor:

nit: make name descriptive

Suggested change
t.Run(`sum by (level) (count_over_time({cluster="prod", namespace=~"loki-.*"}[5m]) / 300)`, func(t *testing.T) {
t.Run("binop metric query", func(t *testing.T) {

Contributor Author:

fixed

%7 = LT builtin.timestamp 1970-01-01T02:00:00Z
%8 = SELECT %6 [predicate=%7]
%9 = RANGE_AGGREGATION %8 [operation=count, start_ts=1970-01-01T01:00:00Z, end_ts=1970-01-01T02:00:00Z, step=0s, range=5m0s]
%10 = DIV %9 300
Contributor:

nit: I wonder whether we should change the string representation of a literal to something like

Suggested change
%10 = DIV %9 300
%10 = DIV %9 LITERAL(300)

or

Suggested change
%10 = DIV %9 300
%10 = DIV %9 INT64(300)

Contributor Author:

Makes sense, overall. But I would make that change in pkg/engine/internal/types/literal.go for each literal type, and that would cause a lot of diff churn in all the tests where literals are used (mostly in predicates). That is too much for this PR and kind of unrelated.

Contributor:

Yeah, it's just a small thing regarding representation. Does not need to be addressed in this PR
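The representation change discussed above could be sketched as a String() method on the literal type. The type and method names here are assumptions for illustration, not the engine's actual API:

```go
package main

import "fmt"

// Int64Literal is a hypothetical stand-in for the engine's integer literal
// type; the point is only the typed string representation.
type Int64Literal struct{ Value int64 }

// String prints the literal with an explicit type tag, e.g. INT64(300),
// instead of the bare value 300.
func (l Int64Literal) String() string {
	return fmt.Sprintf("INT64(%d)", l.Value)
}

func main() {
	fmt.Printf("%%10 = DIV %%9 %s\n", Int64Literal{Value: 300}) // prints %10 = DIV %9 INT64(300)
}
```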

Comment on lines 53 to 55
var SupportedVectorAggregationTypes = []VectorAggregationType{
VectorAggregationTypeSum, VectorAggregationTypeMax, VectorAggregationTypeMin, VectorAggregationTypeCount,
VectorAggregationTypeSum,
}
Contributor:

Is this used anywhere?

Contributor Author:

Not anymore; fixed. Also removed SupportedRangeAggregationTypes, because the name is now confusing given its only usage.

Comment on lines 30 to 33
fields = append(fields, arrow.Field{
Name: fmt.Sprintf("float64.generated.input_%d", i),
Type: types.Arrow.Float,
})
Contributor:

fyi: You can use the semconv package to generate the field definition, either by using semconv.FieldFromFQN() or semconv.FieldFromIdent()

Contributor Author:

done

)

func NewMathExpressionPipeline(expr *physical.MathExpression, inputs []Pipeline, evaluator expressionEvaluator) *GenericPipeline {
return newGenericPipeline(Local, func(ctx context.Context, inputs []Pipeline) (arrow.Record, error) {
Contributor:

I don't think your implementation of the pipeline is correct.

What this pipeline is doing is returning a new arrow.Record with fields float64.generated.input_0 and timestamp_ns.builtin.timestamp, instead of preserving the existing column names.

While this works for a specific subset of queries that have a vector aggregation without grouping, it does not work when there is a vector aggregation with grouping, e.g. sum by (cluster, namespace) (count_over_time({env="prod"}[1m])) / 100.


So instead of generating a new schema, you "only" need to apply the math function on the float64.generated.value (semconv.ColumnIdentValue) column.

Contributor Author:

This pipeline takes its inputs (just one for now, but multiple in general) and creates input_0, input_1, etc. columns to evaluate the given math expression. The math expression is of the form input_0 / 90 * 199 + input_1, regardless of what exactly those input pipelines are. After evaluation is done, this pipeline returns two columns, value and ts. It is a bug that I am missing the other columns (for groupings) in the output, and I will fix that. But it does not return input_0 in any case.
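The input_N binding described above can be sketched standalone: each input pipeline contributes its value column under a positional name, so the expression can be evaluated without knowing where the inputs came from. The types here are simplified stand-ins, not the engine's actual pipeline API:

```go
package main

import "fmt"

// column is a simplified stand-in for a named Arrow column.
type column struct {
	name   string
	values []float64
}

// bindInputs maps the value column of each input pipeline to a positional
// name (input_0, input_1, ...) so a math expression such as
// input_0 / 300 can reference its operands uniformly.
func bindInputs(valueCols [][]float64) []column {
	cols := make([]column, 0, len(valueCols))
	for i, vals := range valueCols {
		cols = append(cols, column{name: fmt.Sprintf("input_%d", i), values: vals})
	}
	return cols
}

func main() {
	cols := bindInputs([][]float64{{10, 20}, {1, 2}})
	for _, c := range cols {
		fmt.Println(c.name, c.values)
	}
}
```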

Comment on lines +229 to +231
type genericFloat64Function[E arrayType[T], T comparable] struct {
eval func(a, b T) (float64, error)
}
Contributor:

Could this be expressed as a generic function

Suggested change
type genericFloat64Function[E arrayType[T], T comparable] struct {
eval func(a, b T) (float64, error)
}
type genericFunction[E arrayType[T], T comparable, R any] struct {
eval func(a, b T) (R, error)
}

Contributor Author:

I have not found a clean way to do that generically. There are some pretty specific type coercions based on the result type, and pow and mod are implemented totally differently.
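For reference, the generic shape proposed above does compile in isolation; the difficulty the author mentions is in the surrounding coercion logic, not the type declaration itself. This standalone sketch uses a simplified arrayType constraint in place of the real Arrow array interface:

```go
package main

import "fmt"

// arrayType is a simplified stand-in for the Arrow array constraint used in
// the real code: an array of T values with positional access.
type arrayType[T comparable] interface {
	Value(i int) T
	Len() int
}

// genericFunction is the generalized form suggested in the review: the
// result type R is a third type parameter instead of being fixed to float64.
type genericFunction[E arrayType[T], T comparable, R any] struct {
	eval func(a, b T) (R, error)
}

// float64Array is a minimal arrayType[float64] implementation.
type float64Array []float64

func (a float64Array) Value(i int) float64 { return a[i] }
func (a float64Array) Len() int            { return len(a) }

func main() {
	div := genericFunction[float64Array, float64, float64]{
		eval: func(a, b float64) (float64, error) { return a / b, nil },
	}
	r, _ := div.eval(300, 5)
	fmt.Println(r) // prints 60
}
```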

Comment on lines +239 to +242
lhsArr, ok := lhs.ToArray().(E)
if !ok {
return nil, arrow.ErrType
}
Contributor:

The array needs to be released

Suggested change
lhsArr, ok := lhs.ToArray().(E)
if !ok {
return nil, arrow.ErrType
}
lhsArr, ok := lhs.ToArray().(E)
if !ok {
return nil, arrow.ErrType
}
defer lhsArr.Release()

Same for rhsArr.

Contributor Author:

I basically reverted #19496 and now it is not needed anymore.

defer valCol.Release()

schema := batch.Schema()
valueCol := semconv.NewIdentifier(types.ColumnNameGeneratedValue, types.ColumnTypeGenerated, types.Loki.Float)
Contributor:

Instead of valueCol you can use semconv.ColumnIdentValue

Contributor Author:

fixed

Comment on lines 29 to 32
fields := make([]arrow.Field, 0, len(inputs))
for i := range inputs {
fields = append(fields, semconv.FieldFromFQN(fmt.Sprintf("float64.generated.input_%d", i), false))
}
Contributor:

It is a bit confusing that for the fields you iterate over the inputs, but for the cols you do an explicit single append().

I think it would be clearer if you also only append a single field.

Contributor Author:

fixed

Comment on lines 27 to 30
schema := arrow.NewSchema([]arrow.Field{
semconv.FieldFromFQN(colTs, false),
semconv.FieldFromFQN(colVal, false),
}, nil)
Contributor:

Can you extend the test so it also has additional (grouping) columns?

Contributor Author:

added two more labels into the test

@rfratto (Member) commented Oct 16, 2025

I'm having a hard time following the input_N concept. Conventionally, other nodes use an expression tree (the LHS of a BinOp can be another BinOp), but I see that in processBinOp, both sides of the binary operation are only ever a literal or a column reference (which I believe is the reason for the input_N concept). Is there a reason we have to do it this way?

Also: it seems like MathExpression could be represented as a projection. Do we need MathExpression as a distinct concept?

@spiridonov (Contributor Author):

If an expression is, for example, A + B, both operands come from subqueries (and from different input children of the given pipeline), so both A and B will have a value column. This code takes the input pipelines in the correct order, extracts their value columns, and puts them into evalInput under input_N names. This way I can run eval() with the (unchanged) expression and multiple inputs. This might be confusing, I agree, but I did not find another way to run eval() on multiple input pipelines with less magic.

I think there should be a separate concept for math expressions, especially when more complex cases are implemented (such as sum_over_time/count_over_time) that require reading both child pipelines and aligning them by timestamps before feeding them into eval(). This should not belong in a projection.

cc @rfratto

@spiridonov spiridonov requested a review from rfratto October 16, 2025 20:08
@rfratto (Member) commented Oct 16, 2025

Thanks, that makes sense if you're trying to prepare for supporting math over two vectors.

I suspect computations over two vectors (in a way that's compatible with LogQL/PromQL) are going to be trickier than they seem. I believe, in relational algebra terms, they're expressed as a combination of an inner join and a projection.

So, given a query like

(
  sum by (job) (rate({namespace="dev"}[$__auto]))
) + (
  sum by (job) (rate({namespace="qa"}[$__auto]))
)

A physical plan mapping literally to the algebra could be

Projection expressions=[
  left.timestamp as timestamp,
  left.job AS job, 
  left.value + right.value as value,
] 
  InnerJoin left_prefix="left" right_prefix="right" on=[
    left.timestamp = right.timestamp && left.job = right.job 
  ] 
    VectorAggregate op=sum groupings=[job] # left-hand side 
      RangeAggregate op=rate  
        DataObjScan 
    VectorAggregate op=sum groupings=[job] # right-hand side 
      RangeAggregate op=rate 
        DataObjScan 

This needs to explicitly be an inner join, since LogQL's metric queries require the sample to exist on both sides of the expression (this matches PromQL's behaviour).

Example

For example, the two inputs of the InnerJoin

| ts | job   | value |
|----|-------|-------|
| 0  | loki  | 5     |
| 10 | mimir | 10    |

and

| ts | job   | value |
|----|-------|-------|
| 0  | loki  | 15    |
| 10 | tempo | 10    |

is joined into

| left.ts | left.job | left.value | right.ts | right.job | right.value |
|---------|----------|------------|----------|-----------|-------------|
| 0       | loki     | 5          | 0        | loki      | 15          |

and is projected to

| unix_ts | job  | value |
|---------|------|-------|
| 0       | loki | 20    |

I think it's probably okay to have a node which combines the work of the projection and the inner join, though it is possible to represent these operations using projections (which we will have a separate node for anyway), and separating them out may make the plans easier to understand.

All that said, I do wonder if math on two vectors is going to require a lot more thought. Would we be able to simplify the logic here if we descoped that from our consideration?
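The inner-join-plus-projection semantics described above can be sketched standalone in Go: samples from both sides are matched on (timestamp, job), and only pairs present on both sides produce an output row, reproducing the tables in the example. The types are simplified stand-ins, not the engine's:

```go
package main

import "fmt"

// sample is a simplified stand-in for one row of a metric vector.
type sample struct {
	ts    int64
	job   string
	value float64
}

// addVectors evaluates left + right with inner-join semantics: a result row
// is emitted only when a sample with the same (ts, job) exists on both sides.
func addVectors(left, right []sample) []sample {
	type key struct {
		ts  int64
		job string
	}
	rhs := make(map[key]float64, len(right))
	for _, s := range right {
		rhs[key{s.ts, s.job}] = s.value
	}
	var out []sample
	for _, s := range left {
		if v, ok := rhs[key{s.ts, s.job}]; ok {
			out = append(out, sample{ts: s.ts, job: s.job, value: s.value + v})
		}
	}
	return out
}

func main() {
	left := []sample{{0, "loki", 5}, {10, "mimir", 10}}
	right := []sample{{0, "loki", 15}, {10, "tempo", 10}}
	// Only (0, loki) exists on both sides, so one row survives: 5 + 15 = 20.
	fmt.Println(addVectors(left, right)) // prints [{0 loki 20}]
}
```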
